Fine-Grained Head Pose Estimation Without Keypoints
Authors
Abstract
Estimating a person's head pose is a crucial problem with many applications, such as aiding gaze estimation, modeling attention, fitting 3D models to video, and performing face alignment. Traditionally, head pose is computed by estimating keypoints on the target face and solving the 2D-to-3D correspondence problem with a mean human head model. We argue that this method is fragile because it relies entirely on landmark detection performance, an extraneous head model, and an ad hoc fitting step. We present an elegant and robust way to determine pose: we train a multi-loss convolutional neural network on 300W-LP, a large synthetically expanded dataset, to predict intrinsic Euler angles (yaw, pitch, and roll) directly from image intensities through joint binned pose classification and regression. Empirical tests on common in-the-wild pose benchmark datasets show state-of-the-art results. Additionally, we test our method on a dataset usually used for depth-based pose estimation and start to close the gap with state-of-the-art depth-based pose methods. We open-source our training and testing code and release our pre-trained models.
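The abstract names "joint binned pose classification and regression" without spelling out the loss. As a minimal sketch of one way such a combined objective could look for a single angle (for example yaw), the snippet below adds a cross-entropy term over angle bins to a mean-squared-error term on the softmax-expected angle. The function names, the bin layout passed in as `bin_edges`, and the weight `alpha` are illustrative assumptions, not details taken from the abstract or from the authors' released code.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def expected_angle(logits, bin_centers):
    """Continuous angle estimate: expectation of bin centers under the softmax."""
    return (softmax(logits) * bin_centers).sum(axis=-1)


def multi_loss(logits, angle_deg, bin_edges, alpha=1.0):
    """Cross-entropy on the angle bin plus alpha * MSE on the expected angle.

    logits:    (batch, n_bins) raw scores for one Euler angle
    angle_deg: (batch,) ground-truth angle in degrees
    bin_edges: (n_bins + 1,) increasing bin boundaries (assumed layout)
    alpha:     weight of the regression term (assumed hyperparameter)
    """
    bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])
    # Index of the bin containing each ground-truth angle, clipped to a valid range.
    target_bin = np.clip(np.digitize(angle_deg, bin_edges) - 1,
                         0, len(bin_centers) - 1)
    probs = softmax(logits)
    rows = np.arange(len(angle_deg))
    cross_entropy = -np.log(probs[rows, target_bin]).mean()
    mse = ((expected_angle(logits, bin_centers) - angle_deg) ** 2).mean()
    return cross_entropy + alpha * mse
```

In this sketch the classification term pushes probability mass toward the correct bin, while the regression term refines the continuous estimate inside and across bins. In a full model, one such loss would be computed for each of yaw, pitch, and roll.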
Related resources
A reduced feature set for driver head pose estimation
Evaluation of driving performance is of utmost importance in order to reduce the road accident rate. Since driving ability includes visual-spatial and operational attention, among others, head pose estimation of the driver is a crucial indicator of driving performance. This paper proposes a new automatic method for coarse and fine estimation of the driver's head yaw angle. We rely on a set of geome...
Multi-Scale Structure-Aware Network for Human Pose Estimation
We develop a robust multi-scale structure-aware neural network for human pose estimation. This method improves the recent deep conv-deconv hourglass models with four key improvements: (1) multi-scale supervision to strengthen contextual feature learning in matching body keypoints by combining feature heatmaps across scales, (2) multi-scale regression network at the end to globally optimize the ...
Fine-grained Visual Categorization using PAIRS: Pose and Appearance Integration for Recognizing Subcategories
In Fine-grained Visual Categorization (FGVC), the differences between similar categories are often highly localized to a small number of object parts (see Figure 1), and significant pose variation therefore constitutes a great challenge for identification. To address this, we propose extracting image patches using pairs of predicted keypoint locations as anchor points. The benefits of this appr...
Cascaded Pyramid Network for Multi-Person Pose Estimation
Multi-person pose estimation has improved greatly in recent years, especially with the development of convolutional neural networks. However, many challenging cases remain, such as occluded keypoints, invisible keypoints, and complex backgrounds, which cannot be well addressed. In this paper, we present a novel network structure called Cascaded Pyramid Network (CPN) which...
Evaluation of Head Pose Estimation for Studio Data
This paper introduces our head pose estimation system, which localizes the nose tip of faces and estimates head poses in studio-quality pictures. After the nose tips in the training data are manually labeled, the appearance variation caused by head pose changes is characterized by a tensor model. Given images with unknown head pose and nose-tip location, the nose tip of the face is localized in a coa...
Journal title:
- CoRR
Volume abs/1710.00925, Issue (none listed)
Pages -
Publication date 2017